REPORT DOCUMENTATION PAGE

Security classification: Unclassified
Accession number: AD-A213 323
Distribution/availability of report: Unlimited
Report number: TR 89-1039
Performing organization: Department of Computer Science, Cornell University, Ithaca, New York 14853
Monitoring organization: Office of Naval Research, 800 North Quincy St., Arlington, VA 22217-5000
Funding/sponsoring organization: Office of Naval Research, 800 North Quincy Street, Arlington, VA 22217-5000
Procurement instrument identification number: N00014-86-K-0092
Title: An Assertional Characterization of Serializability
Personal authors: E. Robert McCurley and Fred B. Schneider
Type of report: Interim
Date of report: September 28, 1989
Subject terms: serializability, database systems, concurrency control, verification, assertional reasoning
Name of responsible individual: Fred B. Schneider, (607) 255-9221
Form: DD Form 1473, 84 MAR


An  Assertional  Characterization  of 
Serializability* 

E.  Robert  McCurley 

School  of  Information  and  Computer  Science 
Georgia  Institute  of  Technology 
Atlanta, Georgia 30332

Fred  B.  Schneider 
Department  of  Computer  Science 
Cornell  University 
Ithaca, New York 14853

September  28,  1989 


Abstract 

Serializability  is  usually  defined  operationally  in  terms  of  sequences 
of  operations.  This  paper  gives  another  definition  of  serializability — 
in  terms  of  sequences  of  states.  It  also  shows  how  this  definition  can 
be  used  to  prove  correctness  of  solutions  to  the  concurrency  control 
problem. 

1 Introduction

A database system is a computer system that stores information. Consistency
constraints restrict system states to those that are meaningful; transactions
are designed so that each individually transforms the database from
one consistent state to another. For example, in a database for a banking
application, a consistency constraint might relate cash-on-hand to the sum
of the account balances; a transaction for a deposit would adjust both an
account balance and cash-on-hand, preserving the consistency constraint.

* This material is based on work supported in part by the Office of Naval Research
under contract N00014-86-K-0092, the National Science Foundation under Grant No.
CCR-8701103, and Digital Equipment Corporation. Any opinions, findings, and
conclusions or recommendations expressed in this publication are those of the
authors and do not reflect the views of these agencies.

During  concurrent  execution  of  transactions,  operations  can  interleave  in 
ways  that  leave  the  database  in  an  inconsistent  state.  Avoiding  such  states 
is  called  the  concurrency  control  problem.  The  traditional  solution  to  this 
problem  is  based  on  serializability  [EGLT76],  which  asserts  that  transactions 
executing  concurrently  “behave  like”  they  are  run  serially,  one  after  another. 
If,  when  run  in  isolation,  each  transaction  transforms  the  database  from 
one  consistent  state  to  another  and  if  transactions  are  serializable,  then 
concurrent  execution  of  those  transactions  will  also  transform  the  database 
from  one  consistent  state  to  another.  Thus,  implementing  serializability 
solves  the  concurrency  control  problem. 

Serializability  is  usually  defined  (operationally)  in  terms  of  sequences, 
called  schedules,  that  list  operations  in  the  order  they  run.  This  would  seem 
to  preclude  use  of  programming  methodologies  based  on  assertions  about 
states — so  called  assertional  reasoning.  It  is  tempting  to  regard  this  as  a 
fundamental  limitation  of  such  methodologies.  In  this  paper,  we  show  this 
view to be erroneous. We give an assertional characterization of serializability
and show how this definition can be used to prove correctness of solutions
to  the  concurrency  control  problem. 

We  proceed  as  follows.  In  Section  2,  we  present  a  database  system 
model  and  give  a  formal  definition  of  serializability.  Two  ways  to  specify 
serializability  using  formulas  of  a  Hoare-style  programming  logic  are  given 
in  Section  3.  In  Section  4,  we  discuss  possible  extensions  to  our  database 
system  model  and  their  implications.  We  conclude,  in  Section  5,  with  a 
comparison  of  our  definition  and  previous  ones. 

2  Serializability 

2.1  System  Model 

A database system $\Sigma$ can be represented by a triple $\langle V, C, T \rangle$, where $V$ is
a set of variables, $C$ is a predicate on $V$, and

$$T:\ [\tau_0 \parallel \cdots \parallel \tau_{N-1}]$$

is a concurrent program in which each transaction $\tau_i$ is a program that
references only variables of $V$.

Variables in $V$ represent the state of the database, including but not
limited to the part that actually contains application data.¹ $C$ is the
consistency constraint of the database. A state is consistent iff $C$ is true in
that state. Each transaction is assumed to terminate in a consistent state
when run alone starting in one.

An example of a database system is given in Figure 1. $\Sigma_0$ models a
database that maintains an unordered collection of records, represented by
variable $s$ of type set. Variables $\mathit{in}_0, \ldots, \mathit{in}_{N-1}$ are sequences, containing
records that have been or will be inserted into $s$, and consistency constraint $C_0$
requires $s$ to contain a subset of these. Each transaction $\mathit{Add}_i$ adds the
records of $\mathit{in}_i$ to $s$. Variable $t_i$ of $\mathit{Add}_i$ is local to the transaction, hence
excluded from $V_0$. In the guard of the loop, $|\mathit{in}_i|$ denotes the length of the
sequence $\mathit{in}_i$. Angle brackets “$\langle$” and “$\rangle$” surround operations that execute
atomically.

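$$\begin{array}{ll}
\Sigma_0 = \langle & V_0:\ s, \mathit{in}_0, \ldots, \mathit{in}_{N-1}, \\
 & C_0:\ s \subseteq (\mathit{in}_0 \cup \cdots \cup \mathit{in}_{N-1}), \\
 & T_0:\ [\mathit{Add}_0 \parallel \cdots \parallel \mathit{Add}_{N-1}]\ \rangle
\end{array}$$

$$\begin{array}{ll}
\mathit{Add}_i: & \langle t_i := 0 \rangle; \\
 & \mathbf{do}\ t_i \neq |\mathit{in}_i| \rightarrow \\
 & \quad \langle s := s \cup \mathit{in}_i(t_i) \rangle; \\
 & \quad \langle t_i := t_i + 1 \rangle \\
 & \mathbf{od}
\end{array}$$

Figure 1: Database System $\Sigma_0$.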
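
The model of Figure 1 can be exercised on small inputs. The following Python sketch is an illustration only, not part of the report: it treats each insertion $\langle s := s \cup \mathit{in}_i(t_i) \rangle$ as one atomic step, leaves the local counters $t_i$ implicit, starts from an empty $s$, and checks that $C_0$ holds after every step of every interleaving. The function names are hypothetical.

```python
def interleavings(seqs):
    """Yield every merge of the step lists that keeps each list's own order."""
    if not any(seqs):
        yield []
        return
    for k, steps in enumerate(seqs):
        if steps:
            rest = seqs[:k] + [steps[1:]] + seqs[k + 1:]
            for tail in interleavings(rest):
                yield [steps[0]] + tail


def holds_c0(s, inputs):
    """C_0: s is a subset of in_0 ∪ ... ∪ in_{N-1}."""
    allowed = set()
    for records in inputs:
        allowed |= set(records)
    return s <= allowed


def c0_is_invariant(inputs):
    """True if C_0 holds after every atomic step of every interleaving."""
    seqs = [list(records) for records in inputs]  # one atomic insert per record
    for schedule in interleavings(seqs):
        s = set()                                 # assumed initial state: s empty
        for record in schedule:
            s.add(record)                         # <s := s ∪ in_i(t_i)>
            if not holds_c0(s, inputs):
                return False
    return True


if __name__ == "__main__":
    print(c0_is_invariant([["a", "b"], ["b", "c"]]))
```

Exhaustive enumeration is only feasible for very small $N$ and short sequences; it is meant to make the model concrete, not to replace the proofs developed later.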


2.2  Serializability 

Serializability of $\Sigma = \langle V, C, T \rangle$ can be understood in terms of an abstract
database system $\Sigma' = \langle V', C', T' \rangle$ where $V'$ and $C'$ are the same as $V$ and
$C$ of $\Sigma$, and

$$T':\ [\tau'_0 \parallel \cdots \parallel \tau'_{N-1}] \qquad (1)$$

is the concurrent program in which each $\tau'_i$ is $\langle \tau_i \rangle$, a transaction that
executes the same operations as $\tau_i$ but as a single atomic operation. Executions
of $T'$ are serial, meaning that transactions are executed one after another
without interleaving.

¹ A commit flag is an example of an element of $V$ that does not contain application
data.

Recall that serializability asserts that every (potentially interleaved)
execution of $T$ “behaves like” some (serial) execution of $T'$. If one system
$T$ implements another system $T'$, then every behavior of $T$ can be viewed as
a behavior of $T'$; hence serializability of $T$ is equivalent to $T$ implementing
$T'$ [Lam83].

2.3  Defining  “Implements” 

In [Pri87, GP85], a formal definition in terms of predicate transformers is
given for when one program implements another. This definition is a
generalization of that presented in [Hoa72] and can be summarized as follows.
Let $S'$ be a program operating on variables $X$, and let $P$ be a predicate
on $X$ such that $S'$ is guaranteed to terminate when run in an initial state
satisfying $P$. Let $S$ be a program on variables $Y$ (disjoint from $X$), which
is intended to implement $S'$.

Programs $S'$ and $S$ are called the abstract and concrete programs,
respectively. The correspondence between the states of these programs is
represented by a predicate $I$ on $X$ and $Y$, called a coupling invariant. It
often takes the form $X = F(Y)$, where $F$ is a function that maps any state
of $S$ to the state of $S'$ that it implements.

The property that $S$ implements $S'$ is formalized as

$$(P \wedge I) \Rightarrow wp(S, wp^*(S', I)). \qquad (2)$$

Here, $wp(S, R)$ denotes the weakest precondition of $S$ with respect to $R$
[Dij76], the set of states in which any execution of $S$ is guaranteed to
terminate in a state satisfying $R$, and $wp^*(S', R)$ denotes the angelic weakest
precondition of $S'$ with respect to $R$, the set of states in which some
execution of $S'$ will terminate in a state satisfying $R$.² Thus, (2) specifies a
correspondence between the effect of concrete program $S$ and that of the
abstract program $S'$, when both programs are viewed as predicate
transformers. In particular, (2) asserts that concrete program $S$ changes its
variables in a way that is consistent (as defined by $I$) with some execution
of the abstract program it implements.

² The relationship between $wp$ and $wp^*$ is as follows:

$$wp(S, \mathit{true}) \Rightarrow (wp^*(S, R) \equiv \neg wp(S, \neg R)).$$

For deterministic $S$, $wp(S, R) \equiv wp^*(S, R)$.
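
The two predicate transformers of Section 2.3 can be made concrete on a finite state space. The Python sketch below is a hedged illustration, not the report's formalism: a program is modeled as a hypothetical function from a state to the set of its possible outcomes, `None` stands for a non-terminating execution, and every state is assumed to have at least one execution.

```python
BOTTOM = None  # marks an execution that fails to terminate


def wp(prog, post, states):
    """States from which every execution terminates in a state satisfying post."""
    return {s for s in states
            if all(t is not BOTTOM and post(t) for t in prog(s))}


def wp_star(prog, post, states):
    """States from which some execution terminates in a state satisfying post."""
    return {s for s in states
            if any(t is not BOTTOM and post(t) for t in prog(s))}


STATES = range(4)


def flip(s):
    """Nondeterministic program: keep s or increment it modulo 4."""
    return {s, (s + 1) % 4}


def step(s):
    """Deterministic program: increment s modulo 4."""
    return {(s + 1) % 4}


def even(s):
    return s % 2 == 0


def odd(s):
    return not even(s)


if __name__ == "__main__":
    print(wp(flip, even, STATES), wp_star(flip, even, STATES))
    print(wp(step, even, STATES), wp_star(step, even, STATES))
```

For the always-terminating `flip`, the states in `wp_star(flip, even, STATES)` are exactly those outside `wp(flip, odd, STATES)`, which is the relationship stated in the footnote; for the deterministic `step`, the two transformers coincide.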
2.4 Serializability as “Implements”

Serializability of $\Sigma = \langle V, C, T \rangle$ can be formalized in terms of
$\Sigma' = \langle V', C', T' \rangle$ using (2) by taking $T'$ of (1) to be the abstract program
and $T$ to be the concrete program. To do so, however, variables $V'$ in $C'$ and $T'$
must be replaced by fresh variables disjoint from $V$. Henceforth, we
assume that $V$ and $V'$ are disjoint. For $P$ in (2) we take $C'$ and for $I$ we take
$\bigwedge_{v_i \in V} v_i = v'_i$, which gives the following characterization of serializability.

Definition: $\Sigma$ is serializable if and only if

$$\Big(C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big) \Rightarrow wp\Big(T, wp^*\Big(T', \bigwedge_{v_i \in V} v_i = v'_i\Big)\Big). \qquad (3)$$

Some simplification of (3) is possible. Since transactions run serially in
$T'$, any execution of $T'$ will be equivalent to some execution of a sequential
program $\rho$ in the set

$$S(T'):\ \{\phi_0; \ldots; \phi_{N-1} \mid \phi_i \text{ is a transaction of } T',\ 0 \le i < N\}.$$

Thus, for any predicate $R$,

$$wp^*(T', R) \equiv \bigvee_{\rho \in S(T')} wp^*(\rho, R). \qquad (4)$$

Substituting the right-hand side of (4) into (3) gives the equivalent formula

$$\Big(C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big) \Rightarrow wp\Big(T, \bigvee_{\rho \in S(T')} wp^*\Big(\rho, \bigwedge_{v_i \in V} v_i = v'_i\Big)\Big). \qquad (5)$$

When transactions of $T'$ are deterministic, (3) can be further simplified.
Each $\rho \in S(T')$ is now deterministic. Since $wp$ and $wp^*$ are equivalent for
deterministic programs, (5) is equivalent to

$$\Big(C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big) \Rightarrow wp\Big(T, \bigvee_{\rho \in S(T')} wp\Big(\rho, \bigwedge_{v_i \in V} v_i = v'_i\Big)\Big). \qquad (6)$$
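
For very small systems with deterministic, terminating transactions, the content of (6) can be checked by brute force: every interleaving must end in a state that some serial order also reaches from the same initial state. The Python sketch below is an illustration under those assumptions, not the report's proof technique. Transactions are hypothetical lists of atomic step functions, and the `view` parameter is an added convenience that projects away transaction-local variables, which the report excludes from $V$.

```python
from itertools import permutations


def interleavings(seqs):
    """Yield every merge of the step lists that keeps each list's own order."""
    if not any(seqs):
        yield []
        return
    for k, steps in enumerate(seqs):
        if steps:
            rest = seqs[:k] + [steps[1:]] + seqs[k + 1:]
            for tail in interleavings(rest):
                yield [steps[0]] + tail


def run(steps, state):
    """Apply atomic steps in order and return the final state."""
    for step in steps:
        state = step(state)
    return state


def is_serializable(transactions, initial_states, view=lambda s: s):
    """Brute-force analogue of (6) for deterministic transactions."""
    for s0 in initial_states:
        serial_finals = {view(run([st for t in order for st in t], s0))
                         for order in permutations(transactions)}
        for schedule in interleavings(list(transactions)):
            if view(run(schedule, s0)) not in serial_finals:
                return False
    return True


def insert(record):
    """Atomic step <s := s ∪ {record}> on a frozenset state."""
    return lambda s: s | {record}


# Two Add-like transactions on a shared set.
adds = [[insert("a"), insert("b")], [insert("b"), insert("c")]]

# Two unsynchronized increments; state is (x, local0, local1).
counter = [
    [lambda s: (s[0], s[0], s[2]), lambda s: (s[1] + 1, s[1], s[2])],
    [lambda s: (s[0], s[1], s[0]), lambda s: (s[2] + 1, s[1], s[2])],
]

if __name__ == "__main__":
    print(is_serializable(adds, [frozenset()]))
    print(is_serializable(counter, [(0, 0, 0)], view=lambda s: s[0]))
```

The set insertions pass because they commute, while the read-then-write counter fails because one interleaving loses an update. The number of interleavings grows combinatorially, which is one reason the report turns to assertional proofs.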

3  Proof  Techniques  for  Serializability 

3.1  Hoare’s  Logic 

Verifying that a database system $\Sigma$ satisfies any of (3), (5) or (6) presents
a difficult problem: the weakest precondition of a concurrent program such
as $T$ is a complicated predicate that is difficult to evaluate. Hoare’s logic
[Hoa69] provides an alternative and often more tractable formalism for
reasoning about concurrent programs. A triple is a formula

$$\{P\}\ S\ \{Q\} \qquad (7)$$

where $S$ is a program and $P$ and $Q$ are predicates on variables of $S$.
Predicates $P$ and $Q$ are called assertions, with $P$ designated the precondition and
$Q$ the postcondition of $S$.

The triple (7) has the following interpretation:

If execution of $S$ begins in a state satisfying $P$ and $S$ terminates,
then the state reached will satisfy $Q$.

Since this interpretation implies nothing about the termination of $S$, a triple
specifies partial correctness. Axioms and inference rules of Hoare’s logic
for a simple sequential programming language can be found in [Hoa69].
Additional axioms and rules for concurrent programs are given in [OG76].

3.2 Effective Criteria for Serializability

The relationship between a triple and $wp$ is

$$Q \Rightarrow wp(S, R) \quad \text{iff} \quad \{Q\}\ S\ \{R\} \ \text{and}\ \mathit{term}(S, Q) \qquad (8)$$

for any predicates $Q$ and $R$ and any program $S$, where $\mathit{term}(S, Q)$ specifies
that $S$ is guaranteed to terminate when started in a state satisfying $Q$. Thus,
our definition of serializability in Section 2.4 and formulas (3), (5) and (6)
imply the following theorem and corollary.

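Theorem 1 Let $\Sigma = \langle V, C, T \rangle$ be a database system and $\Sigma'$ a
corresponding abstract system. $\Sigma$ is serializable if and only if

$$\text{T1.1:}\quad \Big\{C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big\}\ T\ \Big\{\bigvee_{\rho \in S(T')} wp^*\Big(\rho, \bigwedge_{v_i \in V} v_i = v'_i\Big)\Big\},$$

$$\text{T1.2:}\quad \mathit{term}\Big(T, C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big).$$

Corollary 2 If the transactions of $T'$ are deterministic, then $\Sigma$ is
serializable if and only if

$$\text{C2.1:}\quad \Big\{C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big\}\ T\ \Big\{\bigvee_{\rho \in S(T')} wp\Big(\rho, \bigwedge_{v_i \in V} v_i = v'_i\Big)\Big\},$$

$$\text{C2.2:}\quad \mathit{term}\Big(T, C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big).$$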

The conditions of Theorem 1 and Corollary 2 provide effective criteria for
verifying serializability of database systems using assertional reasoning.
Validity of T1.1 and C2.1 can be established using the logic of [OG76];
validity of T1.2 and C2.2 can be established using temporal logic [Pnu81,
MP81].

3.3  An  Example 

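Returning to the example of Figure 1, note that transactions of $\Sigma'_0$ are
deterministic since those of $\Sigma_0$ are. Thus, Corollary 2 can be used to verify
serializability of $\Sigma_0$ as follows. We first prove condition C2.1,

$$\Big\{C'_0 \wedge \bigwedge_{v_i \in V_0} v_i = v'_i\Big\}\ T_0\ \Big\{\bigvee_{\rho \in S(T'_0)} wp\Big(\rho, \bigwedge_{v_i \in V_0} v_i = v'_i\Big)\Big\}. \qquad (9)$$

We abbreviate the formal proof of (9) with the following proof outline [OG76]:

$$\begin{array}{l}
\{C'_0 \wedge s = s' \wedge (\forall i: 0 \le i < N: \mathit{in}_i = \mathit{in}'_i)\} \\
\langle \Delta_0, \ldots, \Delta_{N-1} := \emptyset, \ldots, \emptyset \rangle; \\
\{I0 \wedge (\forall i: 0 \le i < N: \Delta_i = \emptyset)\} \\
[PO(\mathit{Add}_0) \parallel \cdots \parallel PO(\mathit{Add}_{N-1})] \qquad (10) \\
\{I0 \wedge (\forall i: 0 \le i < N: \Delta_i = \mathit{in}_i)\} \\
\{\bigvee_{\rho \in S(T'_0)} wp(\rho, \bigwedge_{v_i \in V_0} v_i = v'_i)\}
\end{array}$$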

Here, each $PO(\mathit{Add}_i)$ is the proof outline shown in Figure 2 below, and

$$I0:\quad (\forall i: 0 \le i < N: \emptyset \subseteq \Delta_i \subseteq \mathit{in}_i) \ \wedge\ s = \Big(s' \cup \bigcup_{0 \le i < N} \Delta_i\Big) \ \wedge\ (\forall i: 0 \le i < N: \mathit{in}_i = \mathit{in}'_i)$$

is an assertion that remains true throughout execution of $T_0$. Variables
$\Delta_0, \ldots, \Delta_{N-1}$ are auxiliary variables [OG76, McC89]. They have type set
and represent the difference between $s$ and $s'$. Since they are used for
purposes of proof only, they need not be implemented.

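In verifying serializability of $\Sigma_0$, we next prove condition C2.2 of
Corollary 2, namely that $T_0$ terminates when started in a state satisfying
$C'_0 \wedge \bigwedge_{v_i \in V_0} v_i = v'_i$. Since the loop in each $\mathit{Add}_i$ executes exactly
$|\mathit{in}_i|$ iterations, $\mathit{Add}_i$ executes a bounded number of operations, each of which
is an assignment that is guaranteed to terminate. Consequently, each $\mathit{Add}_i$
terminates, and it follows that $T_0$ terminates.

$$\begin{array}{l}
\{I0 \wedge \Delta_i = \emptyset\} \\
\langle t_i := 0 \rangle; \\
\{I0 \wedge 0 \le t_i \le |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1)\} \\
\mathbf{do}\ t_i \neq |\mathit{in}_i| \rightarrow \\
\quad \{I0 \wedge 0 \le t_i < |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1)\} \\
\quad \langle s, \Delta_i := s \cup \mathit{in}_i(t_i), \Delta_i \cup \mathit{in}_i(t_i) \rangle; \\
\quad \{I0 \wedge 0 \le t_i < |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i)\} \\
\quad \langle t_i := t_i + 1 \rangle \\
\quad \{I0 \wedge 0 \le t_i \le |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1)\} \\
\mathbf{od} \\
\{I0 \wedge \Delta_i = \mathit{in}_i\}
\end{array}$$

Figure 2: $PO(\mathit{Add}_i)$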

By Corollary 2, therefore, $\Sigma_0$ is a serializable database system. Notice
that an explicit concurrency control mechanism was not needed to achieve
this serializability, even though transactions shared access to variable $s$.
This, then, illustrates how our work can be used to prove correctness of
solutions to the concurrency control problem when the semantics of individual
transactions contribute to the solution.

The preceding example is misleading. Although effective, the criteria
given by Theorem 1 and its corollary are not practical for verifying
serializability of a database system. For all but the simplest database systems, the
assertions used will be too large for a proof of T1.1 or C2.1 to be tractable,
due to the number and complexity of serial executions $\rho \in S(T')$. For
example, consider the database system $\Sigma_1$ of Figure 3. $\Sigma_1$ is obtained from
$\Sigma_0$ by adding variables $\mathit{out}_0, \ldots, \mathit{out}_{N-1}$ and transactions $\mathit{List}_0, \ldots, \mathit{List}_{N-1}$
that write the contents of $s$ to these variables. The consistency constraint
has been strengthened to require that contents written by a List transaction
contain all or none of the elements being added by an Add transaction. This
has made it necessary to synchronize transactions in order to prevent listing
$s$ when only part of some $\mathit{in}_i$ has been added. Synchronization is
accomplished using locking [KS79, Kor83]. Transactions synchronize by acquiring
and releasing a lock, which can have either shared or exclusive mode, denoted
by S and X, respectively. An X-lock is incompatible with other locks, so a
transaction attempting to acquire either an S-lock or an X-lock will block
until no other transaction holds an X-lock.³ To ensure that transactions
terminate when run in isolation, the consistency constraint requires that
transactions hold no locks initially. We denote the fact that a transaction
$\tau_i$ holds an $M$-lock by the predicate $\mathit{locked}(\tau_i, M)$.

$$\begin{array}{ll}
\Sigma_1 = \langle & V_1:\ s, \mathit{in}_0, \ldots, \mathit{in}_{N-1}, \mathit{out}_0, \ldots, \mathit{out}_{N-1}, \\
 & C_1:\ C_0 \wedge (\forall i, j: 0 \le i, j < N: \mathit{out}_i \cap \mathit{in}_j = \mathit{in}_j \vee \mathit{out}_i \cap \mathit{in}_j = \emptyset) \\
 & \qquad \wedge\ (\forall i, M: 0 \le i < N: \neg \mathit{locked}(\mathit{Add}_i, M) \wedge \neg \mathit{locked}(\mathit{List}_i, M)), \\
 & T_1:\ [\mathit{Add}_0 \parallel \cdots \parallel \mathit{Add}_{N-1} \parallel \mathit{List}_0 \parallel \cdots \parallel \mathit{List}_{N-1}]\ \rangle
\end{array}$$

$$\begin{array}{ll}
\mathit{Add}_i: & \langle \mathrm{lock}(\mathrm{S}) \rangle; \\
 & \langle t_i := 0 \rangle; \\
 & \mathbf{do}\ t_i \neq |\mathit{in}_i| \rightarrow \\
 & \quad \langle s := s \cup \mathit{in}_i(t_i) \rangle; \\
 & \quad \langle t_i := t_i + 1 \rangle \\
 & \mathbf{od}; \\
 & \langle \mathrm{unlock}(\mathrm{S}) \rangle
\end{array}
\qquad
\begin{array}{ll}
\mathit{List}_i: & \langle \mathrm{lock}(\mathrm{X}) \rangle; \\
 & \langle \mathit{out}_i := s \rangle; \\
 & \langle \mathrm{unlock}(\mathrm{X}) \rangle
\end{array}$$

Figure 3: Database System $\Sigma_1$.

Corollary 2 requires

$$\Big\{C'_1 \wedge \bigwedge_{v_i \in V_1} v_i = v'_i\Big\}\ T_1\ \Big\{\bigvee_{\rho \in S(T'_1)} wp\Big(\rho, \bigwedge_{v_i \in V_1} v_i = v'_i\Big)\Big\}$$

to be valid. In the postcondition of the analogous triple (9) for $T_0$, all
disjuncts $wp(\rho, \bigwedge_{v_i \in V_0} v_i = v'_i)$ were equivalent. For $\Sigma_1$ of Figure 3, the number
of different disjuncts will be exponential in $N$, making it painful to verify.
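
The growth in the number of distinct disjuncts can be observed on small instances. The Python sketch below is an illustration only: it uses the number of distinct final states reached by serial orders as a proxy for the number of inequivalent disjuncts, starts from an empty $s$, and runs each transaction of $\Sigma_1$ atomically. The function name is hypothetical.

```python
from itertools import permutations


def serial_outcomes(inputs):
    """Final states (s, out_0..out_{N-1}) reached by the serial orders of
    N Add and N List transactions, starting from an empty s."""
    n = len(inputs)
    txns = [("add", i) for i in range(n)] + [("list", i) for i in range(n)]
    finals = set()
    for order in permutations(txns):
        s = frozenset()
        out = [frozenset()] * n
        for kind, i in order:
            if kind == "add":
                s = s | frozenset(inputs[i])   # Add_i as one atomic step
            else:
                out[i] = s                     # List_i: out_i := s
        finals.add((s, tuple(out)))
    return finals


if __name__ == "__main__":
    for inputs in ([["a"]], [["a"], ["b"]], [["a"], ["b"], ["c"]]):
        print(len(inputs), len(serial_outcomes(inputs)))
```

With one-element input sequences, one Add and one List give 2 distinct outcomes and two of each give 14; the count keeps growing with $N$, in contrast to $\Sigma_0$, where every serial order produces the same final state.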

3.4  Simpler  Criteria  for  Serializability 

When transactions are deterministic (as they are in $\Sigma_0$ and $\Sigma_1$), simpler
criteria for serializability can be formulated by promoting abstract
transactions from their passive role in the postcondition of C2.1 to a more active
one. Let $\tau$ be a transaction of $\Sigma$ and let $\tau'$ be the corresponding transaction
of an abstract system for $\Sigma$. An augmentation of $\tau$ is a program $\tau^*$ obtained
by substituting $\langle \alpha; \tau' \rangle$ for some atomic operation $\alpha$ that runs exactly once
in any terminating execution of $\tau$. An augmentation of $T$ is a program

$$[\tau^*_0 \parallel \cdots \parallel \tau^*_{N-1}]$$

in which each $\tau^*_i$ is an augmentation of $\tau_i$. The following theorem shows
that serializability can be characterized using triples for augmentations.

³ The semantics of operations on $s$ have allowed shared-mode locks to be used in
transactions $\mathit{Add}_i$, even though they modify $s$.

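Theorem 3 Let $\Sigma = \langle V, C, T \rangle$ be a database system with deterministic
transactions and let $\Sigma' = \langle V', C', T' \rangle$ be an abstract system for $\Sigma$. Let $T^*$
be an augmentation of $T$ using transactions of $T'$. $\Sigma$ is serializable if

$$\text{T3.1:}\quad \Big\{C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big\}\ T^*\ \Big\{\bigwedge_{v_i \in V} v_i = v'_i\Big\},$$

$$\text{T3.2:}\quad \mathit{term}\Big(T, C' \wedge \bigwedge_{v_i \in V} v_i = v'_i\Big).$$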

Proof Due to Corollary 2, it suffices to show that conditions T3.1 and T3.2
imply C2.1 and C2.2. Since T3.2 and C2.2 are identical, it suffices to show
that T3.1 and T3.2 imply C2.1.

Assume T3.1 holds and consider a terminating execution of $T$ that starts
in a state satisfying the precondition of C2.1. Any such execution can be
formally represented by a history of the form

$$\sigma:\ s_0 \xrightarrow{\alpha_1} s_1 \xrightarrow{\alpha_2} s_2 \cdots s_{M-1} \xrightarrow{\alpha_M} s_M,$$

where $s_0$ is the initial state, and $s_{i-1} \xrightarrow{\alpha_i} s_i$ denotes that atomic action $\alpha_i$
transforms state $s_{i-1}$ to state $s_i$. To show that C2.1 is valid, it suffices to
show that $s_M$ satisfies the postcondition of C2.1.

For any execution $\sigma$ of $T$, the construction of $T^*$ implies that there is a
corresponding history

$$\sigma^*:\ s^*_0 \xrightarrow{\alpha^*_1} s^*_1 \xrightarrow{\alpha^*_2} s^*_2 \cdots s^*_{M-1} \xrightarrow{\alpha^*_M} s^*_M$$

of $T^*$ in which $s^*_0 = s_0$ and $\alpha^*_i$ is either $\alpha_i$ or $\langle \alpha_i; \tau' \rangle$, where $\tau'$ is the abstract
transaction used to form the $\tau^*$ that contains $\alpha^*_i$. Validity of T3.1 implies
that $s^*_M$ satisfies $\bigwedge_{v_i \in V} v_i = v'_i$.

Since $V$ and $V'$ are disjoint, the abstract transactions in $\sigma^*$ can be pulled
from their atomic operations and permuted with operations $\alpha_i$ to obtain a
history

$$\sigma':\ s'_0 \xrightarrow{\alpha_1} s'_1 \cdots s'_{M-1} \xrightarrow{\alpha_M} s'_M \xrightarrow{\phi_0} s'_{M+1} \cdots s'_{M+N-1} \xrightarrow{\phi_{N-1}} s'_{M+N}$$

of $T$ followed by $T'$ in which $s'_0 = s_0$, $s'_{M+N} = s^*_M$, and $\phi_0, \ldots, \phi_{N-1}$ are the
abstract transactions of $T'$ in the order they appear in $\sigma^*$.

Transactions are assumed to be deterministic, and the subsequence of
$\sigma'$ from $s'_0$ to $s'_M$ has the same initial state and operations as the original
execution sequence $\sigma$. Consequently, $s'_M = s_M$. The subsequence of $\sigma'$ from
$s'_M$ to $s'_{M+N}$ is one of the $\rho \in S(T')$. Since $s'_{M+N} = s^*_M$ satisfies $\bigwedge_{v_i \in V} v_i = v'_i$,
$s'_M$ satisfies $wp(\rho, \bigwedge_{v_i \in V} v_i = v'_i)$, from which it follows that $s_M$ satisfies the
postcondition of C2.1. □


The conditions given by Theorem 3 are simpler to verify than those of
Theorem 1 or Corollary 2, due to shorter assertions in the proof of T3.1.
However, this simplicity has been acquired at the expense of completeness.
The conditions of Theorem 1 and its corollary are equivalent to
serializability, while those of Theorem 3 only imply it. An example of a serializable
database system for which T3.1 cannot be proven is given in [McC88].
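
The idea behind T3.1 can be tried on the unlocked Add transactions of Figure 1. The Python sketch below is an illustration under stated assumptions, not the report's proof: each $\mathit{Add}_i$ is augmented at $\langle t_i := 0 \rangle$, which runs exactly once, with an abstract $\mathit{Add}'_i$ that inserts all of $\mathit{in}_i$ into a shadow set $s'$ atomically; both sets start empty; and the postcondition $s = s'$ is checked at the end of every interleaving. The function names are hypothetical.

```python
def interleavings(seqs):
    """Yield every merge of the step lists that keeps each list's own order."""
    if not any(seqs):
        yield []
        return
    for k, steps in enumerate(seqs):
        if steps:
            rest = seqs[:k] + [steps[1:]] + seqs[k + 1:]
            for tail in interleavings(rest):
                yield [steps[0]] + tail


def t31_holds(inputs):
    """Check s = s' at the end of every interleaving of the augmented Add*_i."""
    seqs = [[("init", i)] + [("insert", r) for r in records]
            for i, records in enumerate(inputs)]
    for schedule in interleavings(seqs):
        s, s_abs = set(), set()            # precondition: s = s' (both empty)
        for kind, arg in schedule:
            if kind == "init":
                s_abs |= set(inputs[arg])  # <t_i := 0; Add'_i> updates s' atomically
            else:
                s.add(arg)                 # <s := s ∪ in_i(t_i)>
        if s != s_abs:
            return False
    return True


if __name__ == "__main__":
    print(t31_holds([["a", "b"], ["b", "c"]]))
```

The postcondition here is a plain equality between concrete and abstract variables, with no disjunction over serial orders, which is the source of the shorter assertions mentioned above.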


3.5  Example  Revisited 

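Theorem 3 can be used to prove $\Sigma_1$ of Figure 3 serializable, as follows. The
following augmentations of each $\mathit{Add}_i$ and $\mathit{List}_i$ are used:

$$\begin{array}{ll}
\mathit{Add}^*_i: & \langle \mathrm{lock}(\mathrm{S}) \rangle; \\
 & \langle t_i := 0 \rangle; \\
 & \mathbf{do}\ t_i \neq |\mathit{in}_i| \rightarrow \\
 & \quad \langle s := s \cup \mathit{in}_i(t_i) \rangle; \\
 & \quad \langle t_i := t_i + 1 \rangle \\
 & \mathbf{od}; \\
 & \langle \mathrm{unlock}(\mathrm{S});\ \mathit{Add}'_i \rangle
\end{array}
\qquad
\begin{array}{ll}
\mathit{List}^*_i: & \langle \mathrm{lock}(\mathrm{X}) \rangle; \\
 & \langle \mathit{out}_i := s;\ \mathit{List}'_i \rangle; \\
 & \langle \mathrm{unlock}(\mathrm{X}) \rangle
\end{array}$$

We first prove T3.1,

$$\Big\{C'_1 \wedge \bigwedge_{v_i \in V_1} v_i = v'_i\Big\}\ T^*_1\ \Big\{\bigwedge_{v_i \in V_1} v_i = v'_i\Big\},$$

using the proof outline

$$\begin{array}{l}
\{C'_1 \wedge \bigwedge_{v_i \in V_1} v_i = v'_i\} \\
\langle \Delta_0, \ldots, \Delta_{N-1} := \emptyset, \ldots, \emptyset \rangle; \\
\{I1 \wedge (\forall i: 0 \le i < N: \Delta_i = \emptyset)\} \\
[PO(\mathit{Add}^*_0) \parallel \cdots \parallel PO(\mathit{Add}^*_{N-1}) \parallel PO(\mathit{List}^*_0) \parallel \cdots \parallel PO(\mathit{List}^*_{N-1})] \qquad (11) \\
\{I1 \wedge (\forall i: 0 \le i < N: \Delta_i = \emptyset)\} \\
\{\bigwedge_{v_i \in V_1} v_i = v'_i\}
\end{array}$$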

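In (11), $PO(\mathit{Add}^*_i)$ and $PO(\mathit{List}^*_i)$ are the proof outlines shown in
Figures 4 and 5, and $I1$ is

$$\begin{array}{l}
I0 \\
\wedge\ (\forall i: 0 \le i < N: \mathit{out}_i = \mathit{out}'_i) \\
\wedge\ (\forall i: 0 \le i < N: \Delta_i \neq \emptyset \Rightarrow \mathit{locked}(\mathit{Add}_i, \mathrm{S})) \\
\wedge\ (\forall i, j: 0 \le i, j < N: \neg(\mathit{locked}(\mathit{Add}_i, \mathrm{S}) \wedge \mathit{locked}(\mathit{List}_j, \mathrm{X}))) \\
\wedge\ (\forall i, j: 0 \le i \neq j < N: \neg(\mathit{locked}(\mathit{List}_i, \mathrm{X}) \wedge \mathit{locked}(\mathit{List}_j, \mathrm{X})))
\end{array}$$

Auxiliary variables $\Delta_0, \ldots, \Delta_{N-1}$ play the same role here as in the previous
example. The proof of (11) is straightforward and is omitted here.

Condition T3.2 requires $T_1$ to terminate when started in a state
satisfying $C'_1 \wedge \bigwedge_{v_i \in V_1} v_i = v'_i$. The argument for this is analogous to that used
to prove termination of $\Sigma_0$ except for the possibility of deadlock introduced
by the addition of locking. Deadlock is impossible, however, since locking is
two-phase [EGLT76].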
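
The effect of the locks in $\Sigma_1$ can be observed by enumerating schedules for one Add and one List transaction. The Python sketch below is an illustration, not the report's model: the step encoding and function names are hypothetical, $s$ starts empty, an X-lock is granted only when no other transaction holds any lock, and an S-lock only when no other transaction holds an X-lock. It collects every value a List step can observe.

```python
def add_txn(records, use_locks=True):
    """Steps of Add_i: optional S-lock, one insert per record, unlock."""
    steps = [("insert", r) for r in records]
    return [("lock", "S")] + steps + [("unlock",)] if use_locks else steps


def list_txn(use_locks=True):
    """Steps of List_i: optional X-lock, one listing step, unlock."""
    steps = [("list",)]
    return [("lock", "X")] + steps + [("unlock",)] if use_locks else steps


def listed_values(txns):
    """Every value of s some List step can observe, over all allowed schedules."""
    seen = set()

    def go(pcs, held, s):
        for k, txn in enumerate(txns):
            if pcs[k] == len(txn):
                continue                      # transaction k has finished
            op = txn[pcs[k]]
            new_held, new_s = held, s
            if op[0] == "lock":
                others = {mode for owner, mode in held if owner != k}
                blocked = bool(others) if op[1] == "X" else "X" in others
                if blocked:
                    continue                  # lock request must wait
                new_held = held | {(k, op[1])}
            elif op[0] == "unlock":
                new_held = frozenset(h for h in held if h[0] != k)
            elif op[0] == "insert":
                new_s = s | {op[1]}           # <s := s ∪ in_i(t_i)>
            else:
                seen.add(s)                   # <out_i := s>
            go(pcs[:k] + (pcs[k] + 1,) + pcs[k + 1:], new_held, new_s)

    go((0,) * len(txns), frozenset(), frozenset())
    return seen


if __name__ == "__main__":
    print(listed_values([add_txn(["a", "b"]), list_txn()]))
    print(listed_values([add_txn(["a", "b"], False), list_txn(False)]))
```

With locks, the List step sees either none or all of the records being added, matching the all-or-none requirement of $C_1$; without them it can also observe the partial set containing only the first record.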

4  Extensions 

4.1  Modes  of  Termination 

The  database  system  model  presented  in  Section  2.1  ignores  certain  aspects 
of  actual  database  systems.  One  of  these  is  the  potential  for  transactions 
to  abort.  An  aborting  transaction  typically  executes  a  recovery  protocol  in 
which  operations  are  run  that  undo  the  effects  of  its  changes  to  the  database, 
thereby  giving  the  effect  that  it  never  ran. 

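We can incorporate this mode of transaction termination into our system
model by including in each transaction $\tau_i$ an operation modeling its recovery
protocol and including in $V$ a Boolean variable $\mathit{commit}_i$, initially false, that
$\tau_i$ sets to true if and only if it terminates without executing its recovery
protocol. This change in the system model necessitates a change in the
definition of serializability as well. Our definition specifies in (3) that every
execution of $T$ corresponds to some execution of $T'$. However, there can
be executions of $T$ in which transactions abort for reasons that are not
encountered in a serial execution (e.g., deadlock), and no execution of $T'$
will correspond to these.

$$\begin{array}{l}
\{I1 \wedge \Delta_i = \emptyset \wedge \neg \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\langle \mathrm{lock}(\mathrm{S}) \rangle; \\
\{I1 \wedge \Delta_i = \emptyset \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\langle t_i := 0 \rangle; \\
\{I1 \wedge 0 \le t_i \le |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1) \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\mathbf{do}\ t_i \neq |\mathit{in}_i| \rightarrow \\
\quad \{I1 \wedge 0 \le t_i < |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1) \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\quad \langle s, \Delta_i := s \cup \mathit{in}_i(t_i), \Delta_i \cup \mathit{in}_i(t_i) \rangle; \\
\quad \{I1 \wedge 0 \le t_i < |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i) \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\quad \langle t_i := t_i + 1 \rangle \\
\quad \{I1 \wedge 0 \le t_i \le |\mathit{in}_i| \wedge \Delta_i = \mathit{in}_i(0..t_i - 1) \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\mathbf{od}; \\
\{I1 \wedge \Delta_i = \mathit{in}_i \wedge \mathit{locked}(\mathit{Add}_i, \mathrm{S})\} \\
\langle \mathrm{unlock}(\mathrm{S});\ \Delta_i := \emptyset;\ \mathit{Add}'_i \rangle \\
\{I1 \wedge \Delta_i = \emptyset \wedge \neg \mathit{locked}(\mathit{Add}_i, \mathrm{S})\}
\end{array}$$

Figure 4: $PO(\mathit{Add}^*_i)$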

This problem is circumvented by choosing

$$T'':\ [\tau''_0 \parallel \cdots \parallel \tau''_{N-1}]$$

as the abstract concurrent program for $\Sigma'$ (instead of $T'$), where each
transaction

$$\tau''_i:\ \langle \mathbf{if}\ \mathit{true} \rightarrow \tau'_i \ \Box\ \mathit{true} \rightarrow \mathbf{skip}\ \mathbf{fi} \rangle$$

executes one of $\tau'_i$ or skip when run, the choice being made
nondeterministically. If skip is selected, all variables, including $\mathit{commit}_i$, will be left
unchanged, giving the same effect as if $\tau'_i$ ran but aborted. Every execution
of $T''$ will now be equivalent to some serial execution $\rho$ in the set

$$SS(T'):\ \{\phi_0; \ldots; \phi_{k-1} \mid 0 \le k \le N,\ \phi_i \text{ is a transaction of } T',\ 0 \le i < k\}.$$
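
The set $SS(T')$ can be enumerated directly for small $N$. The Python sketch below is an illustration that assumes each transaction appears at most once in a serial program, so that a sequence of length $k < N$ corresponds to an execution in which the remaining $N - k$ transactions chose skip. The function name is hypothetical.

```python
from itertools import permutations


def serial_programs_with_aborts(transactions):
    """All sequences of k distinct transactions, for 0 <= k <= N."""
    n = len(transactions)
    return [seq for k in range(n + 1)
            for seq in permutations(transactions, k)]


if __name__ == "__main__":
    for seq in serial_programs_with_aborts(["t0", "t1"]):
        print(seq)
```

The empty sequence is always a member: it corresponds to the execution in which every transaction aborts.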
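$$\begin{array}{l}
\{I1 \wedge \neg \mathit{locked}(\mathit{List}_i, \mathrm{X})\} \\
\langle \mathrm{lock}(\mathrm{X}) \rangle; \\
\{I1 \wedge \mathit{locked}(\mathit{List}_i, \mathrm{X})\} \\
\langle \mathit{out}_i := s;\ \mathit{List}'_i \rangle; \\
\{I1 \wedge \mathit{locked}(\mathit{List}_i, \mathrm{X})\} \\
\langle \mathrm{unlock}(\mathrm{X}) \rangle \\
\{I1 \wedge \neg \mathit{locked}(\mathit{List}_i, \mathrm{X})\}
\end{array}$$

Figure 5: $PO(\mathit{List}^*_i)$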


There are several consequences of replacing $T'$ by $T''$. One is that $S(T')$
must be replaced by $SS(T')$ in (5) and formulas derived from it.
Consequently, an implementation of $\Sigma$ that aborts every transaction will be
serializable under our definition, violating constraints on transaction progress
that are often assumed of databases. This can be avoided by specifying
these constraints in addition to serializability.

Another consequence of using abstract transactions of $T''$ is that even
when transactions of $T$ are deterministic, those of $T''$ are not, and
consequently cannot be used in augmentations to prove serializability as described
in Section 3.4. Transactions of $T'$ can still be used in augmentations,
however, as long as each $\tau'_i$ is restricted to positions where it runs if and only if
$\tau_i$ commits. The proof of Theorem 3 is virtually unchanged by this
restriction, guaranteeing that serializability is still ensured by conditions T3.1 and
T3.2.


4.2  Views 

In our definition of serializability, the choice of the coupling invariant reflects
an implicit assumption that transaction behavior is characterized by the
entire system state. This may be too strong. Parts of the state of a real
database system will be invisible to users of the system and need not be
included when considering behavior. For example, the set $s$ of database $\Sigma_1$
might be implemented using an array $a$ that stores elements in contiguous
locations. The order of elements in this array should not be considered when
determining whether or not execution is serializable.

The visible aspects of the database system state are an abstraction of the
system state. This can be modeled by using a function on system states that
maps indistinguishable states to a common abstract representation. We call
such a function a view function. We can incorporate a view function $f$ into
our definition of serializability by replacing the original coupling invariant
$\bigwedge_{v_i \in V} v_i = v'_i$ by $f(v) = f(v')$ in (3), together with formulas derived
from it.
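
A view function for the array example above might look as follows. The Python sketch is a hypothetical illustration: states are dictionaries with an array under the assumed key `"a"`, and the view forgets the order in which elements are stored.

```python
def view(state):
    """Hypothetical view function f: keep the elements of array a, drop their order."""
    return frozenset(state["a"])


def coupled(concrete, abstract):
    """Coupling invariant f(v) = f(v') for the view above."""
    return view(concrete) == view(abstract)


if __name__ == "__main__":
    print(coupled({"a": ["x", "y"]}, {"a": ["y", "x"]}))
    print(coupled({"a": ["x", "y"]}, {"a": ["x", "z"]}))
```

Two states that store the same elements in different positions are related by this coupling invariant, while states with different elements are not.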

5  Discussion 

We have defined serializability in terms of concurrent execution of
transactions implementing serial execution. By choosing an assertional
characterization of “implements”, serializability was expressed using Hoare’s logic.
This makes it possible to verify concurrency control mechanisms using that
logic.

There are many other definitions of serializability [Pap86]. What most
of these definitions have in common is that serializability is defined as a
property of system schedules, sequences of operations resulting from
particular system executions. Schedule-based definitions of serializability fall into
two broad categories based on how schedule behavior is characterized: state
based and conflict based.

In  state-based  definitions,  system  behavior  is  described  in  terms  of  how 
schedules  transform  one  state  to  another.  Definitions  differ  with  respect 
to  the  parts  of  the  state  considered  significant.  A  schedule  is  final-state 
serializable  if  it  and  some  serial  schedule  transform  identical  initial  states  to 
final states that agree on the value of all shared variables. A schedule is view
serializable  if  the  final  states  agree  on  the  values  obtained  by  read  operations 
as  well.  Both  final-state  and  view  serializable  schedules  can  be  expressed  by 
our  definition  of  serializability  by  suitable  choice  of  system  variables. 

Conflict-based  definitions  of  serializability  describe  behavior  somewhat 
indirectly,  using  conflict  relations  (also  known  as  dependency  relations)  on 
operations of the schedule. An operation $o_1$ conflicts with another operation
$o_2$ in the same schedule (written $o_1 < o_2$) if (i) the operations are
from different transactions, (ii) $o_1$ precedes $o_2$, and (iii) $o_1$ and $o_2$ do not
commute  with  each  other  (i.e.,  the  same  initial  state  can  produce  different 
final  states  when  the  operations  are  run  in  different  orders).  The  conflict 
relation  on  operations  is  extended  to  one  on  transactions.  This,  then,  is 
used  to  determine  the  set  of  serial  schedules  exhibiting  the  same  behavior:  a 
schedule  is  conflict  serializable  if  it  and  some  serial  schedule  have  the  same 
conflict  relation  on  transactions. 
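
The conflict-based test can be sketched in Python as follows. This is only an illustration under stated assumptions: operations are modeled as hypothetical `(transaction, reads, writes)` triples, and overlapping read/write sets are used as a conservative stand-in for "do not commute", which is not how the definition above states condition (iii).

```python
from itertools import permutations


def conflicts(o1, o2):
    """o1 precedes o2; they conflict if they belong to different transactions
    and their read/write sets overlap (assumed proxy for non-commutation)."""
    t1, r1, w1 = o1
    t2, r2, w2 = o2
    if t1 == t2:
        return False
    return bool(w1 & (r2 | w2)) or bool(w2 & r1)


def transaction_conflicts(schedule):
    """Extend the conflict relation on operations to one on transactions."""
    relation = set()
    for i, o1 in enumerate(schedule):
        for o2 in schedule[i + 1:]:
            if conflicts(o1, o2):
                relation.add((o1[0], o2[0]))
    return relation


def is_conflict_serializable(schedule):
    """True if some serial schedule has the same conflict relation on transactions."""
    target = transaction_conflicts(schedule)
    txns = sorted({op[0] for op in schedule})
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if transaction_conflicts(serial) == target:
            return True
    return False


# Hypothetical operations: T1 updates x and y twice, T2 copies x into out.
u = ("T1", {"x", "y"}, {"x", "y"})
l = ("T2", {"x"}, {"out"})

print(is_conflict_serializable([u, u, l]))  # serial order T1; T2
print(is_conflict_serializable([u, l, u]))  # T2 runs between T1's operations
```

In the second schedule the transaction-level relation contains both (T1, T2) and (T2, T1), so no serial schedule can reproduce it.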
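
$$
\begin{aligned}
S_2 = (\, & V_2\colon\ \{x,\ y,\ out,\ b\}, \\
          & C_2\colon\ 0 \le b \le 1 \,\wedge\, x + y = 100, \\
          & T_2\colon\ [\, \mathit{UpdateXY} \parallel \mathit{ListX} \,] \,) \\[4pt]
\mathit{UpdateXY}\colon\ & \alpha_0\colon\ \langle x, y := x + b * 17,\ y - b * 17 \rangle; \\
                         & \alpha_1\colon\ \langle x, y := x + \bar{b} * 17,\ y - \bar{b} * 17 \rangle \\[4pt]
\mathit{ListX}\colon\ & \beta_0\colon\ \langle out := x \rangle
\end{aligned}
$$

Figure 6: Database System S2.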

Because schedule-based definitions of serializability consider operation
sequences and system states independently, some database systems considered
serializable under our definition are not considered serializable by the
schedule-based definitions above. The (somewhat contrived) database system
S2 of Figure 6 is an example. There, variables x, y, and out hold integer
values, while b holds a binary value. Transaction UpdateXY subtracts 17
from y and adds it to x, using the value of b to control the order in which this
occurs. ($\bar{b}$ denotes the complement of b.) Using the techniques of Section 3,
it is not difficult to prove S2 serializable according to our definition.

Consider  the  schedule 

$$\alpha_0;\ \beta_0;\ \alpha_1 \tag{12}$$

Note that the value read by ListX depends on the state in which execution
begins: out gets the same value as it does from the serial schedule
$\alpha_0; \alpha_1; \beta_0$ if $b = 1$ initially, while out gets the same value as from
$\beta_0; \alpha_0; \alpha_1$ if $b = 0$. Thus,
schedule (12) is neither final-state nor view serializable since these require
(12)  to  “behave  like”  one  of  the  serial  schedules  for  all  initial  states.  Nor  is 
(12)  conflict  serializable,  since  the  conflict  relation  on  UpdateXY  and  ListX 
has  a  cycle.  Thus,  although  S2  is  serializable  according  to  our  definition,  it 
is  not  serializable  by  the  schedule-based  definitions  of  serializability. 
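
The state dependence of schedule (12) can be checked by direct simulation. The Python sketch below encodes the three operations of Figure 6 as functions on a state tuple `(x, y, out, b)`; the tuple encoding, the function names, and the sample initial values 40 and 60 (chosen only to satisfy x + y = 100) are assumptions of this illustration.

```python
def alpha0(s):
    """First operation of UpdateXY: moves 17 from y to x when b = 1."""
    x, y, out, b = s
    return (x + b * 17, y - b * 17, out, b)


def alpha1(s):
    """Second operation of UpdateXY: moves 17 from y to x when b = 0."""
    x, y, out, b = s
    nb = 1 - b  # complement of b
    return (x + nb * 17, y - nb * 17, out, b)


def beta0(s):
    """Only operation of ListX: copies x into out."""
    x, y, out, b = s
    return (x, y, x, b)


def run(schedule, s):
    """Apply the operations of a schedule to state s in order."""
    for op in schedule:
        s = op(s)
    return s


interleaved = [alpha0, beta0, alpha1]  # schedule (12)
serial = {
    "UpdateXY;ListX": [alpha0, alpha1, beta0],
    "ListX;UpdateXY": [beta0, alpha0, alpha1],
}


def matching_serial(s):
    """Names of the serial schedules that reach the same final state as (12)."""
    target = run(interleaved, s)
    return [name for name, sched in serial.items() if run(sched, s) == target]


# One sample initial state per value of b, each satisfying x + y = 100.
results = {b: matching_serial((40, 60, 0, b)) for b in (0, 1)}
print(results)
```

For each sample initial state exactly one serial schedule matches, but it is a different one for b = 0 than for b = 1, so no single serial schedule agrees with (12) on all initial states.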

In [Cas81], a definition of serializability similar to ours is given. This
definition uses operators of Concurrent Dynamic Logic (CDL) instead of
weakest preconditions to express the “implements” relationship between T
and its serial model T'. Our definition is more general than the CDL definition,
however, because the coupling invariant can be written in terms of a
view  function.  Our  definition  also  provides  more  useful  criteria  for  verifying 
serializability,  since  Hoare’s  logic  offers  a  variety  of  formal  techniques  for 
deriving  and  verifying  triples  that  CDL  currently  lacks. 